Pipeline for Real-time Anomaly Detection in Log Data Streams using Apache Kafka and Apache Spark
نویسندگان
چکیده
منابع مشابه
Real-time News Recommendations using Apache Spark
Recommending news articles is a challenging task due to the continuous changes in the set of available news articles and the contextdependent preferences of users. Traditional recommender approaches are optimized for analyzing static data sets. In news recommendation scenarios, characterized by continuous changes, high volume of messages, and tight time constraints, alternative approaches are n...
متن کاملReal-time Log Query Interface for large datasets using Apache Spark
1University of California, Los Angeles {sandha,xinxu129,yuexin,lizhehan}@cs.ucla.edu ABSTRACT Log Query Interface is an interactive web application that allows users to query the very large data logs of MobileInsight easily and efficiently. With this interface, users no longer need to talk to the database through command line queries, nor to install the MobileInsight client locally to fetch dat...
متن کاملApache Spark and Apache Kafka at the Rescue of Distributed RDF Stream Processing Engines
Due to the growing need to timely process and derive valuable information and knowledge from data produced in the Semantic Web, RDF stream processing (RSP) has emerged as an important research domain. In this paper, we describe the design of an RSP engine that is built upon state of the art Big Data frameworks, namely Apache Kafka and Apache Spark. Together, they support the implementation of a...
متن کاملApproximate Stream Analytics in Apache Flink and Apache Spark Streaming
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing — based on the chosen sample size — can make a systematic trade-off between the output accuracy and computation effi...
متن کاملA comparison on scalability for batch big data processing on Apache Spark and Apache Flink
*Correspondence: [email protected] 1Department of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), University of Granada, Calle Periodista Daniel Saucedo Aranda, 18071 Granada, Spain Full list of author information is available at the end of the article Abstract The large amounts of data have created a need for new fram...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: International Journal of Computer Applications
سال: 2018
ISSN: 0975-8887
DOI: 10.5120/ijca2018917942